Distributed Parallel Computing in Mermera: Mixing Noncoherent Shared Memories

Authors

  • Abdelsalam Heddaya
  • Himanshu Sinha
Abstract

Programmers of parallel processes that communicate through shared globally distributed data structures (DDS) face a difficult choice. Either they must explicitly program DDS management, by partitioning or replicating it over multiple distributed memory modules, or be content with a high-latency coherent (sequentially consistent) memory abstraction that hides the DDS' distribution. We present Mermera, a new formalism and system that enable a smooth spectrum of noncoherent shared memory behaviors to coexist between the above two extremes. Our approach allows us to define known noncoherent memories in a new simple way, to identify new memory behaviors, and to characterize generic mixed-behavior computations. The latter are useful for programming using multiple behaviors that complement each other's advantages. On the practical side, we show that the large class of programs that use asynchronous iterative methods (AIM) can run correctly on slow memory, one of the weakest, and hence most efficient and fault-tolerant, noncoherence conditions. An example AIM program to solve linear equations is developed to illustrate: (1) the need for concurrently mixing memory behaviors, and (2) the performance gains attainable via noncoherence. Other program classes tolerate weak memory consistency by synchronizing in such a way as to yield executions indistinguishable from coherent ones. AIM computations on noncoherent memory yield noncoherent, yet correct, computations. We report performance data that exemplifies the potential benefits of noncoherence, in terms of raw memory performance, as well as application speed.
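To make the AIM claim concrete, the following is a minimal single-threaded sketch, not Mermera's API or the authors' program. It simulates a Jacobi-style solver for a linear system in which each reader may see stale copies of the other components, but never an older value after a newer one, which is a rough stand-in for the ordering that slow memory provides. The function name `async_jacobi`, the `max_delay` bound, and the random delay model are all assumptions made for illustration. Convergence is expected here only because the example matrix is strictly diagonally dominant.

```python
import random


def async_jacobi(A, b, sweeps=200, max_delay=3, seed=0):
    """Simulate Jacobi iteration where reads may return stale values.

    Hypothetical illustration only: staleness is modelled by reading
    component j from an earlier sweep, with a bounded random delay.
    """
    rng = random.Random(seed)
    n = len(b)
    # history[k][j] is the value of x_j written at the end of sweep k.
    history = [[0.0] * n]
    # last_seen[i][j] is the newest version of x_j that reader i has observed.
    last_seen = [[0] * n for _ in range(n)]
    for k in range(sweeps):
        new_x = []
        for i in range(n):
            total = b[i]
            for j in range(n):
                if j == i:
                    continue
                # Choose a possibly stale version of x_j ...
                stale = max(0, k - rng.randint(0, max_delay))
                # ... but never one older than what reader i already saw,
                # so writes to x_j are observed in the order they were made.
                version = max(last_seen[i][j], stale)
                last_seen[i][j] = version
                total -= A[i][j] * history[version][j]
            new_x.append(total / A[i][i])
        history.append(new_x)
    return history[-1]


# Strictly diagonally dominant system whose exact solution is [1, 2, 3].
A = [[4.0, 1.0, 1.0],
     [1.0, 5.0, 2.0],
     [1.0, 2.0, 6.0]]
b = [9.0, 17.0, 23.0]
x = async_jacobi(A, b)
```

Under these assumptions the iterate still approaches the exact solution even though no sweep ever sees a globally consistent snapshot of the vector, which is the sense in which such a computation can be noncoherent yet correct.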

Similar resources

Using Warp to Control Network Contention in Mermera

Parallel computing on a network of workstations can saturate the communication network, leading to excessive message delays and consequently poor application performance. We examine empirically the consequences of integrating a flow control protocol, called Warp control [Par93], into Mermera, a software shared memory system that supports parallel computing on distributed systems [HS93]. For an as...

Memory sharing for interactive ray tracing on clusters

We present recent results in the application of distributed shared memory to image parallel ray tracing on clusters. Image parallel rendering is traditionally limited to scenes that are small enough to be replicated in the memory of each node, because any processor may require access to any piece of the scene. We solve this problem by making all of a cluster’s memory available through software ...

Loop Parallelism on Tera MTA Using Sisal

The difficulty of programming parallel computers has impeded their wide-spread use. The problems are caused by existing hardware and software tools. The software problems on shared-memory and vector computers can be solved by using deterministic high-performance functional languages like SISAL. Distributed-memory computers have even more obstacles than shared-memory parallel machines. Research ...

Understanding the Behavior of Shared Memory Applications Using the SMiLE Monitoring Framework

Data locality is a key factor for the performance of parallel systems. In a Distributed Shared Memory (DSM) system, however, it is difficult for the users to maintain a high data locality as it is usually a priori unknown how the data is distributed among the nodes. In this article we introduce a monitoring framework that allows users to understand the memory behavior of parallel applications. ...

Using Memory-Mapped Network Interfaces to Improve the Performance of Distributed Shared Memory

Shared memory is widely believed to provide an easier programming model than message passing for expressing parallel algorithms. Distributed Shared Memory (DSM) systems provide the illusion of shared memory on top of standard message passing hardware at very low implementation cost, but provide acceptable performance for only a limited class of applications. We argue that the principal sources ...

Publication date: 1996